Lab 2

## Loading required package: maptools
## Loading required package: sp
## Checking rgeos availability: TRUE
## Loading required package: RColorBrewer
## Loading required package: MASS
## Loading required package: rgeos
## rgeos version: 0.5-1, (SVN revision 614)
##  GEOS runtime version: 3.6.1-CAPI-1.10.1 
##  Linking to sp version: 1.3-1 
##  Polygon checking: TRUE
## Loading required package: Rcpp

Question1

Make maps of the study area shaded by the population density and the daily variability of PM2.5 in the year of 2011. The two variables are respectively named as pop.den, pm25.std in the spatial polygon data frame (en.county).

Make map of population density

en.county1<-spTransform(en.county, "+proj=longlat +datum=WGS84 +no_defs")

m <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(0,1,2,3,4,5,Inf)
pal<-colorBin("Set1", domain = en.county$pop.den, bins = bins)

pop.den_map <- m%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(en.county$pop.den),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
    addLegend(pal = pal, values = ~en.county$pop.den, opacity = 0.5, title = 'Population Density in Buffalo',
            position = "bottomright")
pop.den_map

Discussion of the map:

In most rural areas, the population density in red are lower than 1 and the place are located in the rural areas, which are far away from the Buffalo center. And the high population density in light yellow which are higher than 5, are concentrated in the central Buffalo. The population density gets higher with the decline of the distance between their areas and Buffalo center.

Make map of the daily variability of PM 2.5 in 2011

m <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(4,4.2,4.4,4.6,4.8,5,5.2,Inf)
pal<-colorBin("Spectral", domain = en.county$pm25.std, bins = bins)

pm.25_map <- m%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(en.county$pm25.std),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
  addLegend(pal = pal, values = ~en.county$pm25.std, opacity = 0.5, title = 'Daily variability of PM 2.5',
            position = "bottomright")
pm.25_map

Discussion of the map:

From the map, we can find out that the areas with lowest PM 2.5 variability (<4.2) in red are located in the southern-east Buffalo areas. And the highest daily variability(>5) of PM 2.5 in green are distributed in the northern-west part of Buffalo, near the Niagara falls and the boundary bewteen the Canada and US. From the northern-west corner to the southern-east corner, the color in the maps turns from green to red, which means the daily variability of PM 2.5 decrease gradually in spatial.

Question2

Solve the p-median problem when p value is ranged from 3 to 15 stations (p = 3,4,…,15).

set.seed(12345)
distance_total = c(rep(0,13))
set.seed(12345)
for (i in 3:15){
  solution.i <- allocations(en.county, p=i)
  unique(solution.i$allocation)%>%print()
  distance_total[i-2]= sum(unique(solution.i$allocdist))
}
## [1]  55 287 178
## [1]  18  17 284 178
## [1]  18  49 288 245 159
## [1]  14  17 296 246 159 137
## [1]  14  49 225 275  88 159 124
## [1]  14  49 224 297 167 195 183 197
## [1]  14  49 207 275  83 181 148  89 195
##  [1]  14  49  13 276 210 156 195 125 104 197
##  [1]  14  49  13  76 210 173 195 184 281 104 140
##  [1]  14  49  31 188 292 274 156 149 125 119 195 147
##  [1]  14  49  13 188  76 210 156 181 184 281 119 195 140
##  [1]  14  49  13 188  76 210 156 181 125 281 119 195 147 241
##  [1]  14   2  49  13 188  76 210 156 181 125 281 119 195 147 241
distance_total
##  [1] 2971383 2564183 2246062 2045417 1849296 1734819 1658696 1568602
##  [9] 1480025 1415789 1338536 1274066 1232094
p_facility = c(3:15)
TotalDistance = distance_total/100000
par(mar = rep(2, 4))

Report the solutions for each value of p in terms of weighted distance and the sites selected for facilities:

When the p is 3, the total weighted distance is 2971383, and the selected sites are 55, 287, 178. When the p is 4, the total weighted distance is 2564183, and the selected sites are 18, 17, 284, 178. When the p is 5, the total weighted distance is 2246062, and the selected sites are 18, 49, 288, 245, 159. When the p is 6, the total weighted distance is 2045417, and the selected sites are 14, 17, 296, 246, 159, 137. When the p is 7, the total weighted distance is 1849296, and the selected sites are 14, 49, 225, 275, 88, 159, 124. When the p is 8, the total weighted distance is 1734819, and the selected sites are 14, 49, 224, 297, 167, 195, 183, 197. When the p is 9, the total weighted distance is 1658696, and the selected sites are 14, 49, 207, 275, 83, 181, 148, 89, 195. When the p is 10, the total weighted distance is 1568602, and the selected sites are 14, 49, 13, 276, 210, 156, 195, 125, 104, 197. When the p is 11, the total weighted distance is 1480025, and the selected sites are 14, 49, 13, 76, 210, 173, 195, 184, 281, 104, 140. When the p is 12, the total weighted distance is 1415789, and the selected sites are 14, 49, 31, 188, 292, 274, 156, 149, 125, 119, 195, 147. When the p is 13, the total weighted distance is 1338536, and the selected sites are 14, 49, 13, 188, 76, 210, 156, 181, 184, 281, 119, 195, 140. When the p is 14, the total weighted distance is 1274066, and the selected sites are 14, 49, 13, 188, 76, 210, 156, 181, 125, 281, 119, 195, 147, 241. When the p is 15, the total weighted distance is 1232094, and the selected sites are 14, 2, 49, 13, 188, 76, 210, 156, 181, 125, 281, 119, 195, 147, 241

The site: 14, 49, 13, 188, 210, 188, 195, 119, 147, 241 are consistently identified as facilities.

Plot the tradeoff curve of values, Y axis for weighted distance and X axis for p

ggplot(,aes(p_facility, distance_total))+geom_line() +geom_point()+xlim(0,15)

The map of the spider diagram when p = 6

set.seed(12345)
solution.6 <- allocations(en.county1, p=6)
unique(solution.6$allocation)%>%print()
## [1]  14  49 292 167 146 183
#distance_total6= sum(unique(solution.6$allocdist))
#distance_total6

plot_p6 = plot(star.diagram(solution.6),col='darkblue',lwd=2,add=FALSE)

distance_total6_seperate= unique(solution.6$allocdist)
#distance_total6_seperate

n <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(0,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.2,Inf)
pal<-colorBin("Spectral", domain = solution.6$allocdist, bins = bins)

A=star.diagram(solution.6)


map1 <- n%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(solution.6$allocdist),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
  addLegend(pal = pal, values = ~solution.6$allocdist, opacity = 0.4, title = 'The allocation distance',
            position = "bottomright")%>%
  addPolylines(data=A,
              color = '#F00',
              weight = 2)
map1

For the p = 6 solution, the best solution uses sites 14 49 292 167 146 183.

Question3

For the p = 5 solution, the best solution uses sites 18, 49, 121, 159, 230 Graduate students:

1)Solve the p-median problem using allocations in order to find the best solution which does not involve using all five of these sites

set.seed(12345)
encounty2 = en.county[-c(18,49,121,159,230),]
solution.5_new <- allocations(en.county, encounty2, p =5)
unique(solution.5_new$allocation)
## [1]  14   6 225 119 146
  1. How does the best 5 facility solution compare to the one that doesnt use all of the best sites, in terms of weighted distance, and in terms of sites used?

Distance and sites when use all of the best sites

set.seed(12345)
solution.5 <- allocations(en.county,p =5)
unique(solution.5$allocation)
## [1]  18  49 288 245 159
distance_total5= sum(unique(solution.5$allocdist))
distance_total5
## [1] 2246062

When we use all the best sites, the sites are 18 49 288 245 159. And the weighted distance are total 2246062.

Distance and sites when use the sites which does not involve using all five of these sites: 18, 49, 121, 159, 230

set.seed(12345)
encounty2 = en.county[-c(18,49,121,159,230),]
solution.5_new <- allocations(en.county, encounty2, p =5)
unique(solution.5_new$allocation)
## [1]  14   6 225 119 146
distance_total5_new= sum(unique(solution.5_new$allocdist))
distance_total5_new
## [1] 2266776

When we do not use all the best sites, the sites are 14 6 225 119 146. And the weighted distance are total 2266776. It’s obvious that the weighted distance not under the case of best sites are much larger than the weighted distance under the best sites.

  1. Map these two solutions using the spider diagram and report the id of five sites for each solution.

Map when use all of the best sites

set.seed(12345)
solution.5_1 <- allocations(en.county1,p =5)
unique(solution.5_1$allocation)
## [1]  18  49 296 125 191
distance_total5_1= sum(unique(solution.5_1$allocdist))
distance_total5_1
## [1] 23.65503
n <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(0,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.2,Inf)
pal<-colorBin("Spectral", domain = solution.5_1$allocdist, bins = bins)

B=star.diagram(solution.5_1)

map2 <- n%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(solution.5_1$allocdist),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
  addLegend(pal = pal, values = ~solution.5_1$allocdist, opacity = 0.7, title = 'The allocation distance',
            position = "bottomright")%>%
  addPolylines(data=B,
               color = '#F00',
               weight = 2)
map2

Map when use the sites which does not involve using all five of these sites: 18, 49, 121, 159, 230

set.seed(12345)
encounty2 = en.county1[-c(18,49,121,159,230),]
solution.5_new_1 <- allocations(en.county1, encounty2, p =5)
unique(solution.5_new_1$allocation)
## [1]  14   9 287 106 187
distance_total5_new_1= sum(unique(solution.5_new_1$allocdist))
distance_total5_new_1
## [1] 23.84556
n <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(0,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.2,Inf)
pal<-colorBin("Spectral", domain = solution.5_new_1$allocdist, bins = bins)

C=star.diagram(solution.5_new_1)

map3 <- n%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(solution.5_new_1$allocdist),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
  addLegend(pal = pal, values = ~solution.5_new_1$allocdist, opacity = 0.7, title = 'The allocation distance',
            position = "bottomright")%>%
  addPolylines(data=C,
               color = '#F00',
               weight = 2)
map3

Question 5

For the p = 5 solution, find the best solution which takes into account the population density of each demand zone. Make the spider diagram of the solution with population density as a weight and compare these maps with the solution based on Euclidian distance (above)

set.seed(12345)
weight = en.county$pop.den
eucdists<-euc.dists(en.county, en.county)
weighted_distance = matrix(0,297,297)
for (i in 1:297){
  weighted_distance[i,] = weight[i]*eucdists[i,]
}
#weighted_distance
solution.5_weighteddistance = allocations(en.county, p=5, metric = weighted_distance)
unique(solution.5_weighteddistance$allocation)
## [1]   9 232 218  74 156
distance_total5_weighteddistance= sum(unique(solution.5_weighteddistance$allocdist))
distance_total5_weighteddistance
## [1] 2089158

The best sites when the populaion density is identified as weight : 9 232 218 74 156.

Make the spider diagram when consider the effects of weights

set.seed(12345)
weight_new = en.county1$pop.den
eucdists_new<-euc.dists(en.county1, en.county1)
weighted_distance_new = matrix(0,297,297)
for (i in 1:297){
  weighted_distance_new[i,] = weight_new[i]*eucdists_new[i,]
}

solution.5_weighteddistance_new = allocations(en.county1, p=5, metric = weighted_distance_new)
unique(solution.5_weighteddistance_new$allocation)
## [1] 218   9 211  74 156
p <- leaflet(en.county1) %>%
  setView(-78.80,42.90, 9) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    accessToken = Sys.getenv('MAPBOX_ACCESS_TOKEN')))
bins<-c(0,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.2,Inf)
pal<-colorBin("Spectral", domain = solution.5_weighteddistance_new$allocdist, bins = bins, reverse = TRUE)

k=star.diagram(solution.5_weighteddistance_new)

map4 <- p%>%
  addTiles()%>%
  addPolygons(
    fillColor = ~pal(solution.5_weighteddistance_new$allocdist),
    weight = 0.5,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7)%>%
  addLegend(pal = pal, values = ~solution.5_weighteddistance_new$allocdist, opacity = 0.7, title = 'The allocation distance',
            position = "bottomright")%>%
  addPolylines(data=k,
               color = '#F00',
               weight = 2)
map4

For the p = 5 solution, the best solution under the weighted distance uses sites 9, 232, 218, 74, 156. And the total weighted distance is 2089158, which is lower than the distance without considering the weight. When we compare this map which is made with the weighted distance and the above map which is only depent on the euclidian distance, we can find out the best sites are more concentrated in this map.